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Abstract 

The Adaptive Multilevel Splitting algorithm is a very powerful and versatile iterative method 
to estimate the probability of rare events, based on an interacting particle systems. In [3], in a 
so-called idealized setting, the authors prove that some associated estimators are unbiased, for 
each value of the size n of the systems of replicas and of a resampling number k. 

Here we go beyond and prove these estimator’s asymptotic normality when n goes to +oo, for 
any fixed value of k. The main ingredient is the asymptotic analysis of a functional equation on 
an appropriate characteristic function.Some numerical simulations illustrate the convergence to 
rely on Gaussian confidence intervals. 
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1 Introduction 

Many models from physics, chemistry or biology involve stochastic systems for different purposes: 
taking into account some uncertainty with respect to some data parameters, or to allow for 
dynamical phase transitions between different configurations of the system. This phenomenon 
often referred to as metastability is observed for instance when one uses a d-dimensional 
overdamped Langevin dynamics 

dXt = -W (X t )dt + VW'dWt 

associated with a potential function V with several local minima. Here W denotes a d- 
dimensional standard Wiener process. When the inverse temperature f3 increases, the transitions 
become rare events (their probability decreases exponentially fast). 

In this paper, we adopt a numerical point of view, and analyze a method which outperforms 
a pure Monte-Carlo method for a given computational effort in the small probability regime 
(in terms of relative error). Two important families of methods have been introduced in the 
1950s and extensively developed later in order to efficiently address this rare event estimation 
problem: importance sampling, and importance/multilevel splitting - see [11] and [9] for a 
more recent treatment. For more general references, we refer for instance to [12]. 

The method we study in this work belongs is a multilevel splitting algorithms. The main 
advantage of this kind of methods is that they are non-intrusive: one does not need to modify 
the model to go beyond a pure Monte-Carlo approximation scheme. The present method has 
an additional feature: the so-called levels are computed adaptively. To explain more precisely 
the algorithm and its properties, from now on we only focus on a simpler setting, though 
paradigmatic of the rare event estimation problem. 

Let A be a real random variable, and a some fixed, given threshold. We want to compute 
an approximation of the tail probability p := P(X > a). The splitting strategy in the regime 
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when a becomes large consists in introducing the following decomposition of p as a product of 
conditional probabilities: 

¥(X > a)= P(X > a n \X > a n - i)...P(X > a 2 |X > ai )P(X > 01 ), 

for some sequence of levels ai < ... < a n - 1 < a n = a. The common interpretation of this 
formula is that the event that X > a is split in n (conditional) crossing probabilities for X , 
each much higher than p and thus easier to approximate (each independently). 

To optimize the variance, the levels must be chosen such that all the conditional probabilities 
are equal to p 1 ^ 71 , with n as large as possible. However, the a priori knowledge of levels satisfying 
this condition is not available in practical cases. 

Notice that in principle to apply this splitting strategy, one needs to know how to sample 
according to the conditional distributions appearing in the splitting formula. If this condition 
holds, we say that we are in an idealized setting. 

Adaptive techniques, where the levels are computed on-the-fly have been introduced in the 
2000s in various contexts, under different names: Adaptive Multilevel Splitting (AMS) [5], [6], 
[7], Subset simulation [2] and Nested sampling [13]. 

In this paper, we focus on the versions of AMS algorithms studied in [3]. Two parameters 
are required: a number of (interacting) replicas n, and a fixed integer k £ {1,..., n — 1}, such 
that the proportion of replicas that are killed and resampled at each iteration is k/n. The 
version with k — 1 has been studied in [10], and is also (in the idealized setting) a special case 
of the Adaptive Last Particle Algorithm of [14]. 

A family of estimators ( p n,k ) n > 2 ,i<k<n-i is introduced in [3]. The main property established 
there is unbiasedness: E \p n,k ] = p. Moreover, the computational cost is analyzed in the regime 
when k is fixed and n -A +oo. Nevertheless comparisons between different values of k is made 
assuming a non natural procedure: N independent realizations of the algorithm are necessary 
to define an empirical estimator with realizations of p n,k . We remove this procedure by showing 
directly an asymptotic normality result for the estimator p n,k , allowing for the direct use of 
Gaussian asymptotic confidence intervals. 

Other Central Limit Theorems for Adaptive Multilevel Splitting estimators (in different 
parameter regimes) have been obtained in [4], [5] and [8]. 

The main result of this paper is Theorem 2: if k and a are fixed, under the assumption 
that the cumulative distribution function of X is continuous, when n +oo we have the 
convergence in law of y/n(p n,k — p) to a centered Gaussian random variable, with variance 
— p 2 log(p) (independent on k). 

The main novelty of the paper is the case k > 1: indeed when k — 1 the law of the estimator 
is explicitly known (it involves a Poisson random variable with parameter —nlog(p)) and the 
Central Limit Theorem can be derived by hand. More precisely, we prove the asymptotic 
normality of log (p n,k ) : and conclude thanks to the delta-method. However when k > 1, the 
key idea is to prove a functional equation, as introduced in [3], for the characteristic function 
of this random variable; the basic ingredient is a decomposition according to the first step of 
the algorithm. Thus one of the main messages of this paper is that the functional equation 
technique allows to prove several key properties of the AMS algorithm in the idealized setting: 
unbiasedness and asymptotic normality. 

The paper is organized as follows. In Section 2, we introduce the main objects: the idealized 
setting (Section 2.1) and the AMS algorithm (Section 2.2). Our main result (Theorem 2) is 
stated in Section 2.3. Section 3 is devoted to the detailed proof of this result. Finally Section 4 
contains a numerical illustration of the Theorem. 


2 Adaptive Multilevel Splitting Algorithms 

2.1 Setting 

Let X be some real random variable. We assume that X > 0 almost surely. We want to 
estimate the probability p = P(X > a), where a > 0 is some threshold. When a goes to Too, p 
goes to 0 and we have to estimate a rare event. More generally, we introduce the conditional 
probability for 0 < x < a 

P(x) =P(X > a\X > x). (1) 

We notice that p = P( 0) and that P(a) = 1. 

The following assumption is crucial: 
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Assumption 1 Let F denote the cumulative distribution function of X. We assume that F 
is continuous. 

2.2 Algorithm 

The algorithm depends on two parameters, fixed once and for all: 

• the number of replicas n; 

• the number k £ {1,..., n — 1} of replicas that are resampled at each iteration. 

The other necessary parameters are the initial condition x and the stopping threshold a. 
In the sequel, when we consider a random variable X \, the subscript i denotes the index in 
{1,..., n} of a particle, and the superscript j denotes the iteration of the algorithm. 

In the algorithm below and in the following, we use classical notations for k -th order 
statistics. For Y = (Yi,... ,Y n ) independent and identically distributed (i.i.d.) real valued 
random variables with continuous cumulative distribution function, there exists almost surely 
a unique (random) permutation a of {l,...,n} such that Y a ( i) < ... < Y a ( n y For any 

k £ {l,...,n}, we then use the classical notation Y(k) — Ya(k) to denote the k -th order 

statistics of the sample Y. 

Algorithm 1 (Adaptive Multilevel Splitting) 

Initialization: Define Z° = x. Sample n i.i.d. realizations Xf,..., X®, with the law C(X\X > 
x). 

Define Z 1 = X® k y the k-th order statistics of the sample X° — (X°,..., X°) ; and a 1 the 

(a.s.) unique associated permutation: < • • • < X ° ± ( n ). 

Set j — 1. 

Iterations (on j > 1): While Z j < a: 

• Conditionally on Z j , sample k new independent random variables (Yf ,..., Y£), according 
to the law C(X\X > Z j ). 

• Set 

X i ITU ■(,) */(^r*(»)<* 

[xp 1 */ {<r )**(*)> k. 

In other words, the particle with index i is killed and resampled according to the law 
jC(X\X > Z^) if Xi~ x < Z\ and remains unchanged if X\~ x > ZF Notice that the 
condition (a j )~ 1 (i) < k is equivalent to i £ {cr J '(l),..., cr j (k)}. 

• Define = X 3 ^ k y the k-th order statistics of the sample X J = (Af,... : X^), and cF +1 

the (a.s.) unique associated permutation: XF +1 ^ < ... < Jh +1 ^. 

• Finally increment j <— j 1. 

End of the algorithm: Define J n,k (x) — j — 1 as the (random) number of iterations. Notice 
that J n,k (x) is such that Z jn,k ^ < a and Z jn,k ^ +1 > a. 

The estimator of the probability P(x) is defined by 



with 

C n,k (x) = — Card |z; X/ ^ > a| . (3) 

When x = 0, to simplify notations we set p n,k = p n,k ( 0). 

2.3 The Central Limit Theorem 

The main result of the paper is the following asymptotic normality result. 

Theorem 2 Under assumption 1, we have the following convergence in law, when n —>> +oo, 
for any fixed k and a: 

Vn(p n,k ~ p) ->■ V(0, -p 2 log (p)). (4) 
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Notice that the asymptotic variance does not depend on k. This result can be used to define 
asymptotic normal confidence intervals, for one realization of the algorithm and n —> +oo. 
However, the speed of convergence is not known and may depend on the estimated probability 
n and on the parameter k. 

Thanks to Theorem 2, we can study the cost of the use of one realization of the AMS 
algorithm to obtain a given accuracy when n +oo. In [3], the cost was analyzed when using 
a sample of M independent realizations of the algorithm, giving an empirical estimator, and 
the analysis was based on an asymptotic analysis of the variance in the large n limit. 

Let e be some fixed tolerance error, and a > 0. Denote r a such that P (Z £ [—r a ,r a ]) = 1 — a, 
where Z is a standard Gaussian random variable. 

Then for n large, an asymptotic confidence interval with level 1 — a, centered around p, is 



Then the e error criterion \p n,k — p\ < e is achieved for n of size — . 

However, in average one realization of the AMS algorithm requires a number of steps of the 
order — nlog(p)//c, with k random variables sampled at each iteration. Another cost comes 
from the sorting of the replicas at initialization, and the insertion at each iteration of the k new 
sampled replicas in the sorted ensemble of the non-resampled ones. Thus the cost to achieve 
an accuracy of size e is in the large n regime of size nlog(n)(— p 2 log(p)), not depending on k. 

This cost must be compared with the one when using a pure Monte-Carlo approximation, 
with an ensemble of non-interacting replicas of size n: thanks to the Central Limit Theorem, 
the tolerance criterion error e is satisfied for n of size Despite the log(n) factor in 

the AMS case, the performance is improved since p 2 log(p) = o (p) when p 0. 

Remark 1 Unlike in [3], here we are not able to compare the different choices for k, since at 
the limit n —> +oo all the involved quantities do not depend on k. 

3 Proof of the Central Limit Theorem 

The proof is divided into the following steps. First, thanks to Assumption 1, we reduce our 
study to the case when X is distributed according to the exponential law with parameter 
1: P(A > z) — exp(— z) for any z > 0. Second, we explain why it is simpler to look at 
log (p n,k (x)) instead of p n,k (x ), and to use the delta-method to get the result. The third step is 
the introduction of the characteristic function of log (p n,k (x))-, following the definition of the 
algorithm, we prove that it is solution of a functional equation with respect to x, which can be 
transformed into a linear ODE of order k. Finally, we analyze the solution of this ODE in the 
limit n —)> +oo. 

3.1 Reduction to the exponential case 

We start with a reduction of the study to the case when the random variable X is exponentially 
distributed with parameter 1. It is based on a change of variable with the following function: 

A(ai) = — log(l — F(x)). (5) 

It is well-known that F(X) is distributed according to the uniform law on (0,1) (thanks to the 
continuity Assumption 1), and thus A (A) is exponentially distributed with parameter 1. In 
Section 3 (see Corollary 3.4) of [3], it is proved that, the law of the estimator p n,k is equal to 
the one of q n,k , which is the estimator defined with the same values of the parameters n and 

k , but with two differences: the random variable is exponentially distributed with parameter 

l, and the stopping level is A(a). Notice that E[g n,fe ] = exp(—A(a)) = 1 — F(a) = p (by the 
unbiasedness result of [3]). 

Since the arguments are intricate, we do not repeat them here and we refer the interested 
reader to [3]; from now on, we thus assume: 

Assumption 3 Assume that X is exponentially distributed with parameter 1; we denote 
C{X) = £{1). 

As a consequence, it is enough to prove the following result: 
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Proposition 1 When n +oo 

y/n(p n ’ k — p) —>• A/"(0, aexp(—2a)). (6) 

We will need the following notations: 

• f(z) = exp(— z)t z> o (resp. F(z) = (l — exp(— z))t z >o) for the density (resp. the 
cumulative distribution function) of the exponential law 8(1) with parameter 1. 

• fn,k(z) m k( 7 ^)F(z) k ~ 1 f(z) (l — F(z)) n k for the density of the k- th order statistics of a 
sample of size n, made of independent and exponentially distributed random variables, 
with parameter 1. 

Finally, in order to deal with the conditional distributions C(X\X > x) (under Assumption 
3 it is just a shifted exponential distribution x + 8(1) thanks to the loss of memory property of 
the exponential law) in the algorithm, we set for any x > 0 and any y > 0 


f(v,z) = f(y - x), F(y;x) = F(y -x), 
}n,k{y\x) = fn,k(y - x), 


/ y 


fn,k(z)dz , F n , k (y;x) = F n , k (y - x). 


The following result will be very useful: 


d 


-j^fn,l(y,x) = nfn,l(y\x). 

for k e {2,... ,n - 1}, ^ fn,k(y,x) = (n - k + 1) ( f n ,k(y,x) - f n ,k-i(y;x)). 
The proof follows from straightforward computations. 


(7) 


(8) 


3.2 Proof of the CLT 

The first important idea is to prove the Proposition 1 for any initial condition x G [0,a], not 
only for x = 0: 

y/n(p n ' k (x) — p(x)) A7(0, (a — x) exp(—2 (a — x))). (9) 

A natural idea is to introduce the characteristic function of p n,k (x ), and to follow the 
strategy developed in the previous work. Nevertheless, we are not able to derive a useful 
functional equation with respect to the x variable. A better strategy is to look at the logarithm 
of the estimator, and to use a particular case of the delta-method (see for instance [15], 
Section 3): if for some sequence of real random variables (# n )neN and some 0 G R we have 
>Jn(Q n — 0) —»• A7(0,cr 2 ), then y / n(exp(^ n ) — exp(0)) A7(0, exp(2#)cr 2 ) - both in distribution 

when n —> +oo. 

We thus introduce for any t G R and any 0 < x < a 

4>n,k(t,x) := E[exp(*<v / n(log(j5 Tl ’ fc (a:)) - log(p(x))))j. (10) 

For technical reasons, we also set 

Xn,k(t,x) := E^exp (it^/np n,k (x)^ = exp (ity/n(x - a))<j> n ,k(t,x), (11) 

on which we derive a functional equation, in order to prove the following result, which implies 
Proposition 1 with the choice x — 0. 

Proposition 2 When n —>> +oo ; for any 0 < x < a and any t G R 

<t>n,k{t, x) exp ( - a - ) ) . (12) 

The rest of this section is devoted to the proof of Proposition 2, thanks to the following 
series of results. 
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Lemma 1 (Functional Equation) For any n £ N and any k £ {1,..., n — 1}, and for any 
t £ R, the function x Xn,k(t,x) is solution of the following functional equation (with unknown 
x)' for any 0 < x < a 

x(t,x) = e ttV " los(1 -") f x{t,y)fn,k(y\x)&y (13) 

J X 

+ g eit ^log(l-X) p(5(a:) n ) < < S {X)1 +1) \ (14) 

1=0 

where (S(x)f)i<j< n are iid with law C(X\X > x) and where S(x)(^ is the l-th order statistics 
of this sample (with convention S{x)^ = x). 

Remark 2 The function f>n,k(t,x) is solution of a more complex functional equation: for 
any 0 < x < a 

4>n,k(t;x) = f e tVE1 ° e< ' 1 ~ k/n ' , e~ lt ' /E( ' x ~ v) 4> n! k(t;y)fn,k(y,x)dy 

J X 

k-l 

+ ^ e -«v^(x-a) e « v ^log(l-l) p(s , (;r) n ) < a < ^ 

1=0 

The advantage of the equation on Xn,k is that we are able to deduce that x > Xn,k(t,x) is 
solution of a linear ODE with constant coefficients. On x (f n ^(t,x ) the ODE would not have 
constant coefficients and would be more difficult to express. 

Remark 3 If instead of considering log (p n,k )(x) we try to prove directly the central limit 
theorem onp n,k (x), the same strategy leads to a functional equation where on the left-hand side 
we have the parameter t, and on the right-hand side, in the integrand we have the parameter 
t(l — k/n). 

Proof. The idea (like in the proof of Proposition 4.2 in [3]) is to decompose the expectation 
according to the value of the first level Z 1 = On the event {Z 1 > a} = {J n,k (x) = 0}, 

the algorithm stops and p n,k (x) = for the unique l £ {0,..., k — 1} such that S(x)™^ < 
a < S(x)™ l+1 y Thus 


^ e itVn log(p n 




k (x) 


=o ] = £ 


0 ity/n log(1 — fi') 


P(5(x)f 0 < a < S(x)” f+1) ). (15) 


If Z 1 < a, for the next iteration the algorithm restarts at the initial condition Z 1 , and 


E[h 


ity/n log(p n ’ fe O)) t 


\Ljn,k( x 


= E 


)>o] 


Z ]lz !<a 


= e «VHio g (i-|) E [e[ e «^ log (p n ’ ,5( z 1) )| Z i]i z i <a ] 

= [*„,*(*, Z 1 ) t z i <a ] 

= e* t%AI i°g(i-») f Xn,k(t, y)fn,k(y,x) dy. 

J X 


With the two identities and the definition (11) of Xn,k, we thus obtain (13). 


(16) 


□ 


We exploit the functional equation (13) on x Xn,k(t,x), to prove that this function is 
solution of a linear ODE with constant coefficients. 

Lemma 2 (ODE) Let n and k £ {1,... ,n — 2} be fixed. There exist real numbers ii n,k 
and (rm k )o<m<k-i, depending only on n and k, such that for any fixed t £ R, the function 
x Xn,k(t,x) satisfy the following Linear Ordinary Differential Equation (ODE) of order k: 
for x £ [0, a] 


d^_ 

dx k 


Xn,k(t,x) 


_ ity/n log(l- 


^y n ' k Xn,k(t, x) + ^ r ™ k fi^Xn,k(t,x). 


dx 71 


(17) 
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The coefficients /a n,k and (r^)o<m<k-i satisfy the following properties: 

H n ' k = (— T) k n ... (n — k + 1) 

k-x 

X k — r^ 1 * X m — (A — n )... (A — n + k — 1) for all X £ ! 


(18) 


Proof. The proof follows the same lines as Proposition 6.4 in [3]. We introduce the following 
notation 

k-1 

e».fc(t,aO ^ c «VS Io g( 1 -i) P ( 5(a .)n ) < a < ,s'(,r)'; . 

1=0 

Then by recursion, we can prove that for 0 < l < k — 1 and for any x < a and t £ R 

f- l {Xn,k{t,x)-Q n ,k{t,x)) = /z j»> fc e * t ' / ” log(1 -£ ) J Xn,k(t,y)fn,k-i(y;x)dy 


+ T (Xn.fc(^x) - e n ,k(t,x)) . 


(19) 


The idea for the recursion is to differentiate (19) with respect to x and to use (8). The 
coefficients satisfy the following recursion (notice that they do not depend on t): 


Po’ K = i,fi£ = -(n-k + i + i)tf’ k ; 
r o,f+i = ~{n - k + l + l)r”f, if l > 0, 

Cm+i = C-i ,l ~{n-k + l + 1 1 < m < l, 

n,k _ 

r z,z ~ - 1 -' 


( 20 ) 


Differentiating once more when l = k — 1, we obtain 


^ (Xn,fe(t,^) - 0n,fc(M)) = M n,fc e ZtV ^ l0S(1 ^Xn,fe(^) 


+ ^2 r ™ k -lf-^ ( Xn,k(t,X ) - ©n,/c(t,x)) , (21) 


with := /x^ ,/c and r^ k := r r f k k . These coefficients are the same as in [3]; in particular it 
was proved there that the claimed properties are satisfied (using the unbiasedness result for 
the estimation of the probability, see Section 6.4 there). The polynomial equality of (18) is in 
fact equivalent to the following identity: for all j 6 {0,..., k — 1} 


r jk 2 :_‘ r i m 

^exp((n - k + j + 1)(* - a)) = ^ r% k exp ((n - k + j + l)(x - a)). 


k -1 


Using the expression of the order statistics in the exponential case, S n ,k(t, .) is a linear 
combination of the exponential functions z exp(— nz ),..., exp(—(n — k + l)z); therefore 

ik ^ — 1 im 


and thus (21) gives (17). 

□ 

To conclude, it remains to express the solution of (17), and to analyze the asymptotics 
n —>■ +oo. Since the ODE is of order k, in order to uniquely determine the solution, more 
information is required: for instance we need to know the vector of the derivatives of order 
0,1,..., /c — 1 of x i —y Xn,k(t , x) at some point. From the functional equation (13), we are able 
to show the following asymptotic result at point a, when n —>• +oo. 

Lemma 3 (Initial condition) For any fixed k £ {1, ..., } and any t £ M, we have 


Xn,k(t,a) = 1 
J^nXn,k(t,x) | 


I x=a n—^oo 


= <>(&> 


if m £ {1,..., k — 1} . 


( 22 ) 
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Proof. The equality Xn,k(t,a) — 1 is trivial, since p n,k (a) = 1. From (19) and (21), we 
immediately get (by recursion) that for 1 < m < k — 1 

d m | d m | 

—Q n , k {t,x)\ x=a . 

We introduce the following decomposition 


®n,k(t,x) — YXe 


\t\fri log(l — f ) 


P(5'(x)p ) < a < 5(x)(5^)) 


= ^ (e^iogd-i) - l) 

z=o 

k — 1 

+ J2nS(x)? l) <a<S(x)? l+1) ) 

1=0 

— • VC) T 1 Pn,k (&5 ^0 7 

where denotes the cumulative distribution function of the Lth order statistics (with the 
convention F n $(a\x) — 1 for x < a). 

Thanks to (8) and a simple recursion on l, it easy to prove that for any 0 < l < k and any 

m > 1 

d m „ , J 


which immediately yields 


; F nj i(a-,x)\ = 0(n m ), 


r fl n ,/e(t,x) 0(—=)n ri 

x=a n^oo Wfl 


In fact, it is possible to prove a stronger result: if 1 < l < k and 0 < m < l then 

d 171 I 

— FnAa - x )\ x=a = o, 

by recursion on l and using (8) recursively on m. We thus obtain for 1 < m < k — 1 


-(l - F n . k (ayx)) = 0. 


This concludes the proof of the Lemma. 


The last useful result is the following: 

Lemma 4 (Asymptotic expansion) Let k G {1,..., } and t G M be fixed. Then for n large 
enough, we have 


Xn,k(t,x) = ^2ri l n,k(t)e 


\ l n k {t)(x — a) 


for complex coefficients satisfying: 


and for 2 < l < k 


^n,k(t) ~ ityfn + \ T o(l), 

n— >oo 

Vn,k if) 1; 

n—>oo 

7 i 2 -K(l — 1) 

>?n,k{t) ~ n(l-e s ), 

n—>oo 

0. 


Proof. We denote by (A^ fc(£))i<z<fc the roots of the characteristic equation (with unknown 
AGC): 

ip A)...(n /c T 1 A) ity/n iog(i — ^-) _ q 

n...(n — k + 1) 

By the continuity property of the roots of a complex polynomial of degree k with respect to its 
coefficients, we have 

^n,k(t) -> Aoo, 

Tl n—>- oo 
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where (A^(t))i</<fc are the roots of (1 — X) k — 1: thus, we get A* k (t) — o(n), 

’ n —>oo 

j i27T(l-l) 

K,k(t) ~ n(l-e k ). 

n—>oo 

To study more precisely the asymptotic behavior of A* fc (t), we postulate an ansatz 

An,feW = c t ^n + d t + o( 1), 

n—>oo 

and identify the coefficients ct = it and dt — t 2 /2 thanks to 


1- -j= - — + 
\Jn n 


y = m 

/ n—>oo yn n y n J 


ity/nl og(l—£) _ ^ _ 


itk t 2 k 2 (1 

—j= - 9 -h o ( — 

oo yn zn 


In particular, for n large enough, (X l nk (t))i<i<k are pairwise distinct, and (24) follows. 

The coefficients (Vn,k(t))i<i<k are then solutions of the following linear system of order k: 


Vn,k( f ) + - + VnA*) = Xn,k(a), 

+ ••• + = £ £;Xn,k(t, o), 

V n,k(t) (K,k(t)) + - + VuA 1 ) (^n,k(i)) = y=T ^T=T Xn,k(t, a). 


(27) 


Using Cramer’s rule, we express each r] l nk (t) as a ratio of determinants (the denominator 
being a Vandermonde determinant is non zero when n is large enough). For l G {2,..., k}, we 
have 

j m _ det(M^ fc (t)) 

Vn,k\t) /_y 


v(aLw,---,aL(0) 


n—>-+oo 


0 


where 


A< fc (i) = 


A n,k(t) 


—2 

A n,k(t') 




1 




(A*, fc (*)) fc_ V 


is such that det(M^ fc (£)) —>• 0 (since A n fe (£) —>• 0), and the denominator is given by a 

’ n —> + oo 

Vandermonde determinant v(a n t k(t), ■ ■ ■, A^ >fc (t)) —> v(A^o(t),..., A^(t)) ^ 0. 

Finally, = 1 - 2 J 7 ^ fc (i) 


n—>-+oo 


□ 


The proof of Proposition 2 is now straightforward, recalling from (10) and (11) the relation 
4>n,k(t,x) = exp(— ity/n(x — a))xn,k(t, x) between the characteristic function and the 

auxiliary function Xn,k, and taking the limit n +oo thanks to Lemma 4. 


4 Numerical results 

In this Section, we provide some numerical illustrations of the Central Limit Theorem 2. We 
apply the algorithm in the idealized setting with an exponentially distributed random variable 
with parameter 1; indeed, we have seen that in the idealized setting, in law the algorithm 
behaves like this, if the threshold is suitably chosen. 

In the simulations we present, the probability we estimate is exp(—6), which is approximately 

2 . 10 -3 . 

In figure 1, we Lx the value k = 10, and we show histograms for n = 10 2 ,10 3 ,10 4 , 
with different values for the number M independent realizations of the algorithm, such that 
nM = 10 8 (we thus have empirical variance of the same order for all cases). When n increases, 
the normality of the estimator is conhrmed. In Lgure 1, we give Q-Q plots, where the empirical 
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Figure 1: Histograms for k — 10 and p = exp(—6): n = 10 2 ,10 3 ,10 4 from left to right 





Figure 2: Q-Q plot for k = 10 and p = exp(—6): n = 10 2 ,10 3 ,10 4 from left to right 



quantiles of the sample are compared with the exact quantiles of the standard Gaussian random 
variable (after normalization). 

In Figure 3, we show histograms for M — 10 4 independent realizations of the AMS algorithm 
with n = 10 4 and k £ { 1 , 10,100}; we also provide associated Q-Q plots in figure 4, which 
prove that in this regime the normality assumption seems to be satisfied for all values of k. 


Figure 3: Histograms for n = 10 4 and p = exp(—6): k = 1,10,100 from left to right 





Figure 4: Q-Q plot for n = 10 4 and p = exp(—6): k = 1,10,100 from left to right 
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